Using Networks to Combine “Big Data” and Traditional Surveillance to Improve Influenza Predictions
Authors
Abstract
Seasonal influenza infects approximately 5-20% of the U.S. population every year, resulting in over 200,000 hospitalizations. The ability to assess infection levels more accurately and to predict which regions face higher infection risk in future time periods can inform targeted prevention and treatment efforts, especially during epidemics. Google Flu Trends (GFT) has generated significant hope that "big data" can be an effective tool for estimating disease burden and spread. GFT estimates are available in real time, two weeks earlier than traditional surveillance data collected by the U.S. Centers for Disease Control and Prevention (CDC). However, GFT has made some infamous errors and is significantly less accurate at tracking laboratory-confirmed cases than syndromic influenza-like illness (ILI) cases. We construct an empirical network using CDC data and combine it with GFT to substantially improve GFT's performance. The improved model predicts infections one week into the future as accurately as GFT predicts the present, and it does particularly well during epidemics and in the regions most likely to facilitate influenza spread.
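The abstract does not specify how the network is built or how it is combined with GFT, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that edges come from lag-1 correlations between regional CDC series, that CDC data arrive with a two-week delay, and that the two sources are combined by ordinary least squares. The function names (`build_network`, `design_matrix`, `fit`) and the parameters `threshold`, `delay`, and `horizon` are hypothetical.

```python
import numpy as np


def build_network(cdc, threshold=0.3):
    """Hypothetical empirical network from CDC data of shape (weeks, regions).

    W[i, j] is the correlation between region j last week and region i
    this week, kept only if it reaches `threshold`; rows are normalised.
    """
    n_regions = cdc.shape[1]
    W = np.zeros((n_regions, n_regions))
    for i in range(n_regions):
        for j in range(n_regions):
            if i == j:
                continue  # no self-loops
            r = np.corrcoef(cdc[:-1, j], cdc[1:, i])[0, 1]
            if r >= threshold:
                W[i, j] = r
    row_sums = W.sum(axis=1, keepdims=True)
    # Normalise only the rows that have at least one neighbour.
    np.divide(W, row_sums, out=W, where=row_sums > 0)
    return W


def design_matrix(cdc, gft, W, delay=2, horizon=0):
    """Features at week t: intercept, GFT[t], network-weighted CDC[t - delay].

    The target is CDC[t + horizon]; horizon=0 is a nowcast, horizon=1
    a one-week-ahead forecast.
    """
    n_weeks = cdc.shape[0]
    gft_now = gft[delay:n_weeks - horizon]
    neighbours = cdc[:n_weeks - delay - horizon] @ W.T
    X = np.column_stack(
        [np.ones(gft_now.size), gft_now.ravel(), neighbours.ravel()]
    )
    y = cdc[delay + horizon:].ravel()
    return X, y


def fit(X, y):
    """Ordinary least-squares coefficients."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef
```

Under these assumptions, `fit(*design_matrix(cdc, gft, build_network(cdc), horizon=1))` would return an intercept, a GFT weight, and a network weight for a one-week-ahead forecast. Comparing its error against a GFT-only fit on held-out weeks is one way to check whether the network term adds anything.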
Similar resources
Using Participatory Web-based Surveillance Data to Improve Seasonal Influenza Forecasting in Italy
Traditional surveillance of seasonal influenza is generally affected by reporting lags of at least one week and by continuous revisions of the numbers initially released. As a consequence, influenza forecasts are often limited by the time required to collect new and accurate data. On the other hand, the availability of novel data streams for disease detection can help in overcoming these issues...
A data-driven model for influenza transmission incorporating media effects
Numerous studies have attempted to model the effect of mass media on the transmission of diseases such as influenza; however, quantitative data on media engagement has until recently been difficult to obtain. With the recent explosion of 'big data' coming from online social media and the like, large volumes of data on a population's engagement with mass media during an epidemic are becoming ava...
Penalized Lasso Methods in Health Data: application to trauma and influenza data of Kerman
Background: Two main issues that challenge model building are the number of events per variable and multicollinearity among explanatory variables. Our aim is to review statistical methods that tackle these issues, with emphasis on the penalized Lasso regression model. The present study aimed to explain problems of traditional regressions due to small sample size and m...
2016 Olympic Games on Twitter: Sentiment Analysis of Sports Fans Tweets using Big Data Framework
Big data analytics is one of the most important subjects in computer science. Today, due to the increasing expansion of Web technology, a large amount of data is available to researchers. Extracting information from these data is one of the requirements for many organizations and business centers. In recent years, the massive amount of Twitter's social networking data has become a platform for ...
Monitoring of Regional Low-Flow Frequency Using Artificial Neural Networks
Much of the country lies in the arid and semiarid regions of the world, where sensitive and fragile ecosystems are easily degraded and destroyed. In this paper, artificial neural networks (ANNs) are introduced to obtain improved regional low-flow estimates at ungauged sites. A multilayer perceptron (MLP) network is used to identify the funct...